R

Quantitative Methodology (UPF)

Jordi Mas Elias

https://www.jordimas.cat/

Summary

  • R Workflow
  • Objects in R
  • Functions in R
  • Types of data files

Warm up

Paint the fence, first…

…karate later.

Data wrangling

R

RStudio workflow

  • Install packages: Once a year.
install.packages(c("dplyr", "ggplot2", "tidyr", 
                   "readr", "readxl", "haven", "foreign"))
  • Load packages: Every time you start R.
library(dplyr)
library(ggplot2)
library(readr)
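
To see which packages still need the once-a-year install step, something like the following sketch may help. The names pkgs and missing_pkgs are made up for this illustration, and the install line is left commented out because it needs an internet connection.

```r
# Packages this session needs (taken from the library() calls on the slide)
pkgs <- c("dplyr", "ggplot2", "readr")

# requireNamespace() returns TRUE if a package is installed and can be loaded,
# without attaching it the way library() does
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that are not yet installed (possibly none)
missing_pkgs <- pkgs[!is_installed]
print(missing_pkgs)

# Uncomment to install only what is missing:
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```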

R objects

Objects

  • A. Values
  • B. Vectors: c(value1, value2, value3, ...)
  • C. Dataframes: tibble(vector1, vector2, ...)
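
A minimal sketch of the three kinds of object, using invented party data. It uses base R's data.frame() so that it runs without extra packages; with dplyr or tibble loaded, tibble() can be called the same way.

```r
# A. A value: a single number or string stored under a name
year <- 2020

# B. Vectors: several values of the same type, combined with c()
party <- c("A", "B", "C")
seats <- c(120, 45, 10)

# C. A dataframe: vectors of equal length placed side by side as columns
parties <- data.frame(party, seats)

# str() prints the structure: 3 observations of 2 variables
str(parties)
```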

Objects. General rules

General rules for creating objects:

  • Can’t start with a number.
  • Can’t contain: ^, !, $, @, +, -, /, *.
  • An existing object is overwritten if you save a new one with the same name.
  • Case sensitive.
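
The rules above can be tried out with a few throwaway objects. The object names here are invented for the example.

```r
my_data2 <- 10   # valid name: letters, digits and underscores, not starting with a digit
my_data2 <- 20   # same name again: the old value 10 is replaced without any warning
My_Data2 <- 30   # a different object, because names are case sensitive

my_data2             # 20
My_Data2             # 30
exists("MY_DATA2")   # FALSE: no object with this capitalisation was created

# make.names() shows how R would repair names that break the rules
make.names(c("2data", "my-data"))   # "X2data" "my.data"
```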

Objects. Vectors

Table 1: Class and type of vector

Class      Type       Example
Character  Character  c("b", "c", "d")
Factor     Integer    factor(c("b", "c", "d"))
Integer    Integer    c(10L, 6L, 12L)
Numeric    Double     c(1.1, 3.5, 10.2)
Date       Double     as.Date(c("2019/06/04", "2019/11/02", "2020/01/23"))
Logical    Logical    c(FALSE, TRUE, FALSE)
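
The table can be checked directly with class() and typeof(). This sketch builds the six example vectors and collects them in a list (the name vectors is just for illustration) so that both functions can be applied to all of them at once.

```r
vectors <- list(
  chr = c("b", "c", "d"),
  fct = factor(c("b", "c", "d")),
  int = c(10L, 6L, 12L),
  num = c(1.1, 3.5, 10.2),
  dat = as.Date(c("2019/06/04", "2019/11/02", "2020/01/23")),
  lgl = c(FALSE, TRUE, FALSE)
)

# class(): how R treats the vector
sapply(vectors, class)

# typeof(): how the values are stored internally
sapply(vectors, typeof)
```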

Objects. Rules

  • All values in a vector must be of the same type
  • All vectors in a dataframe must have equal length
  • Values in a vector can be selected with [ ]
  • Rows and columns in a df can be selected with [ , ]
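
A small sketch of these four rules on invented data. It uses base R's data.frame(); the object names are hypothetical.

```r
# Same type: mixing types forces everything into one type (here, character)
mixed <- c(1, "a", TRUE)
class(mixed)            # "character"

# Equal length: columns of 3 and 2 values cannot form a dataframe
res <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Selecting values in a vector with [ ]
seats <- c(120, 45, 10)
seats[2]                # second value: 45
seats[c(1, 3)]          # first and third values: 120 10
seats[seats > 40]       # values that meet a condition: 120 45

# Selecting rows and columns in a dataframe with [rows, columns]
parties <- data.frame(party = c("A", "B", "C"), seats = seats)
parties[1, ]                         # first row, all columns
parties[, "seats"]                   # all rows, one column
parties[parties$seats > 40, "party"] # parties with more than 40 seats
```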

Exercise

CHES Latin America

Codebook available here.

  • Glimpse the ches_la object. Rows? Columns?
  • Which countries are in the dataset?
  • Select Colombia and create a new object named ches_co.
  • Glimpse the new dataframe (how many observations?).
  • Select party variables, lrecon, galtan, crime, regions, ethnic_minorities.
  • Overwrite the object ches_co with the transformation.

Functions

Functions (I): General rules

General rules for using functions:

  • They can contain several arguments.
    • function(argument1, argument2 ...)
  • Normally, the first argument is a vector or a dataframe.
  • Use ? or other help resources to learn how to use them.
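
As an illustration of these rules, mean() takes a vector as its first argument and further named arguments that change its behaviour. The vector x is invented for the example.

```r
x <- c(4.2, 7.9, NA, 5.5)

# First argument only: the missing value makes the result NA
mean(x)

# A second argument, na.rm, tells mean() to drop missing values first
mean(x, na.rm = TRUE)

# Functions can be nested: round() receives the result of mean() as its first argument
round(mean(x, na.rm = TRUE), digits = 1)   # 5.9
```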

Exercise: Afrobarometer - Nigeria

Codebook available here.
File: https://www.jordimas.cat/files/nig.csv

Functions (II): Without arguments

These are normally functions related to the working environment.

ls()
installed.packages()
search()
getwd()

Functions (III): With one argument

Normally applied to a dataframe:

glimpse()
dim()
summary()

Normally applied to a vector:

# to a character vector
unique()
table()

# to a numeric vector
mean() 
hist()
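
A quick way to try these one-argument functions is on a small invented dataframe (survey is a hypothetical name). glimpse() needs dplyr and hist() opens a plot window, so both are left out of this base R sketch.

```r
survey <- data.frame(
  region = c("North", "South", "North", "West"),
  age    = c(23, 41, 35, 58)
)

# Applied to the whole dataframe
dim(survey)        # 4 rows, 2 columns
summary(survey)    # one summary per column

# Applied to a character vector (a column selected with $)
unique(survey$region)   # each region once
table(survey$region)    # how many times each region appears

# Applied to a numeric vector
mean(survey$age)   # 39.25
```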

Functions (IV): With many arguments

Exercise: Functions

sample()
seq()
rep()
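
One possible way to explore these three functions is to name their arguments explicitly, as in this sketch. set.seed() is added only so that the random draws are reproducible; the seed value is arbitrary.

```r
# seq(): where to start, where to stop, and the step or the number of values
seq(from = 0, to = 10, by = 2)          # 0 2 4 6 8 10
seq(from = 0, to = 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# rep(): what to repeat and how
rep(c("a", "b"), times = 3)   # "a" "b" "a" "b" "a" "b"
rep(c("a", "b"), each = 2)    # "a" "a" "b" "b"

# sample(): what to draw from, how many values, with or without replacement
set.seed(123)
draw <- sample(x = 1:10, size = 3)                              # 3 different numbers
coin <- sample(c("heads", "tails"), size = 5, replace = TRUE)   # repeats allowed
draw
coin
```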

Help!

Using R is impossible without help.
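
A few common ways of asking R for help, sketched with sample() as the example function. The help pages are wrapped in interactive() so that they only open when working at the console.

```r
if (interactive()) {
  ?sample           # open the help page of a function
  help("sample")    # the same, written as a function call
}

# args() prints the arguments of a function and their defaults
args(sample)

# formals() returns the same information as a list; here only the argument names
arg_names <- names(formals(sample))
arg_names   # "x" "size" "replace" "prob"
```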

Import data

Import functions

File type  Package  Functions
csv        readr    read_csv() or read_csv2()
xls        readxl   read_xls()
xlsx       readxl   read_xlsx()
dta        foreign  read.dta()
dta        haven    read_dta()
sav        haven    read_sav()
spss       haven    read_spss()

Import functions

  • Package readr.
read_csv("data/gapminder.csv")
read_csv2("data/gapminder2.csv")
read_tsv("data/gapminder3.tsv")
read_delim("data/gapminder4.txt", delim = "/")
  • Other packages:
as_tibble(foreign::read.dta("data/gapminder5.dta"))
load("data/gapminder6.Rdata")
as_tibble(foreign::read.spss("data/gapminder7.sav", to.data.frame = TRUE))
readxl::read_xlsx("data/gapminder8.xlsx", sheet = 2)
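
The difference between read_csv() and read_csv2() is easiest to see on two tiny files. This sketch assumes readr is installed; the toy data and the temporary files are invented here and stand in for the data/ files on the slide.

```r
library(readr)

# Two temporary files holding the same toy data in the two csv flavours
comma_file <- tempfile(fileext = ".csv")
semi_file  <- tempfile(fileext = ".csv")
writeLines(c("country,gdp", "Chile,1.5", "Peru,2.25"), comma_file)   # "," separator, "." decimal
writeLines(c("country;gdp", "Chile;1,5", "Peru;2,25"), semi_file)    # ";" separator, "," decimal

# read_csv() expects commas between columns and a decimal point
from_comma <- read_csv(comma_file)

# read_csv2() expects semicolons between columns and a decimal comma
from_semi <- read_csv2(semi_file)

# Both calls should give the same two columns: country (character) and gdp (numeric)
from_comma
from_semi
```

Opening a semicolon file with read_csv() usually yields a single column with everything in it, which is a good hint that read_csv2() or read_delim() is the function to use.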